- The MAILING DATE of this communication appears on the cover sheet with the correspondence address -
Period for Reply 



A SHORTENED STATUTORY PERIOD FOR REPLY IS SET TO EXPIRE 3 MONTH(S) FROM 
THE MAILING DATE OF THIS COMMUNICATION. 

- Extensions of time may be available under the provisions of 37 CFR 1.136(a). In no event, however, may a reply be timely filed
after SIX (6) MONTHS from the mailing date of this communication. 

- If the period for reply specified above is less than thirty (30) days, a reply within the statutory minimum of thirty (30) days will be considered timely. 

- If NO period for reply is specified above, the maximum statutory period will apply and will expire SIX (6) MONTHS from the mailing date of this communication. 

- Failure to reply within the set or extended period for reply will, by statute, cause the application to become ABANDONED (35 U.S.C. § 133). 
Any reply received by the Office later than three months after the mailing date of this communication, even if timely filed, may reduce any 
earned patent term adjustment. See 37 CFR 1.704(b). 

Status 

Responsive to communication(s) filed on 20 September 2004.
2a) [ ] This action is FINAL. 2b) [X] This action is non-final.

3) [ ] Since this application is in condition for allowance except for formal matters, prosecution as to the merits is
closed in accordance with the practice under Ex parte Quayle, 1935 C.D. 11, 453 O.G. 213.

Disposition of Claims 

4) [X] Claim(s) 1-20 is/are pending in the application.

4a) Of the above claim(s) is/are withdrawn from consideration.

5) [ ] Claim(s) is/are allowed.

6) [X] Claim(s) 1-20 is/are rejected.

7) [ ] Claim(s) is/are objected to.

8) [ ] Claim(s) are subject to restriction and/or election requirement.

Application Papers 

9) [ ] The specification is objected to by the Examiner.

10) [ ] The drawing(s) filed on is/are: a) [ ] accepted or b) [ ] objected to by the Examiner.

Applicant may not request that any objection to the drawing(s) be held in abeyance. See 37 CFR 1.85(a).
Replacement drawing sheet(s) including the correction is required if the drawing(s) is objected to. See 37 CFR 1.121(d).

11) [ ] The oath or declaration is objected to by the Examiner. Note the attached Office Action or form PTO-152.

Priority under 35 U.S.C. § 119 

12) [ ] Acknowledgment is made of a claim for foreign priority under 35 U.S.C. § 119(a)-(d) or (f).
a) [ ] All b) [ ] Some * c) [ ] None of:

1. [ ] Certified copies of the priority documents have been received.

2. [ ] Certified copies of the priority documents have been received in Application No. .

3. [ ] Copies of the certified copies of the priority documents have been received in this National Stage
application from the International Bureau (PCT Rule 17.2(a)).
* See the attached detailed Office action for a list of the certified copies not received.
DETAILED ACTION

1. This action is responsive to communications: The Amendment filed 09/20/04 to the 
original Application filed on 03/23/01, which claims priority to a provisional application. 

2. The rejection of claims 1-20 under 35 U.S.C. 103(a) as being unpatentable over Smadja
(US: 6,621,930 09/16/03) has been withdrawn as necessitated by Amendment.

3. Claims 1-20 are pending in the case. Claims 1, 7, and 14 are independent claims. 

Claim Rejections - 35 USC §103 

4. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in 
section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are 
such that the subject matter as a whole would have been obvious at the time the invention was made to a person 
having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the 
manner in which the invention was made. 

5. Claims 1-5 and 7-20 are rejected under 35 U.S.C. 103(a) as being unpatentable over
Russell-Falla et al (US: 6,675,162 01/06/04) in view of Weiser et al (US-5,982,507 11/09/99).

-In regard to independent claims 1, 7, and 14, Russell-Falla teaches a computer-
implemented method comprising a processor (Abstract) and memory (Fig. 1: 30)
connected to said processor, wherein the method further comprises:

recognizing a concept (column 2, lines 54-63) that represents a basic idea (content 
category)(column 2, lines 35-39; column 4, lines 32-47) in a document format (column 2, 
lines 35-39; column 3, lines 17-20); and 
incorporating said concept in a concept model (i.e. "pornographic", "commercial
solicitations", "racist", "good", "bad", etc)(column 3, lines 39-43 & 60-67; column 8,
lines 43-45).

Russell-Falla further teaches wherein the document format could be any number of 
common document formats including an electronic email message, a word processing document, 
hypertext document, and any number of other types of documents (columns 3 & 4, lines 23-26 & 
51-53). Russell-Falla does not teach wherein the initial document format has to be converted to
one of the common document formats to be processed. Weiser et al teach converting a document
format (email message) from an email format to a common generic format (column 12, lines 53-
55). It would have been obvious to one of ordinary skill in the art at the time of the invention for
Russell-Falla to have converted its initial format document to one of the common document
formats listed above, because Weiser et al teach that by doing so the common format can be
understood by the document system (column 12, lines 44-56)(i.e. converting a document to a
format able to be processed by a specific system provides the obvious advantage of being
able to process the document in that system).
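
For illustration only, the kind of conversion discussed above (an email message reduced to a common generic format before further processing) might be sketched as follows. This is not taken from Weiser or Russell-Falla; the function name to_common_format and the "title"/"text" field names are hypothetical, and only single-part plain-text messages are handled.

```python
from email import message_from_string


def to_common_format(raw_email):
    """Convert a raw email message into a simple generic record.

    The record layout ("title" and "text") is an assumption made for
    this sketch; it is not described in either cited reference.
    """
    msg = message_from_string(raw_email)
    # Only single-part messages are handled; multipart bodies are left empty.
    body = "" if msg.is_multipart() else msg.get_payload()
    return {"title": msg.get("Subject", ""), "text": body}


raw = "Subject: Hello\nFrom: a@example.com\n\nBody text\n"
record = to_common_format(raw)
```

Once every incoming document is reduced to the same record layout, a downstream categorizer only needs to understand that one layout.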

-In regard to dependent claims 2 and 8, Russell-Falla teaches identifying a plurality of 
features (column 4, lines 59-61: "identify the regular expressions") in said document format, 
wherein said plurality of features represent evidence ("useful in discriminating a specific 
category of information")(column 4, lines 61-66) of said concept in said format. 
-In regard to dependent claims 3 and 9, Russell-Falla teaches calculating a concept weight
for said concept ("calculating a rating of the page")(column 3, lines 54-57) using a plurality of
feature weights ("requires a weighting be provided for each word or phrase in the list")(column
3, lines 46-57) associated with said plurality of features ("regular expressions")(column 2, lines
55-59; column 8, lines 9-19) wherein said concept weight represents a recognition confidence
level for said concept (column 3, lines 54-57);

comparing said concept weight with a predetermined threshold (column 2, lines 64-67;
column 3, lines 1-16).
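
As a minimal sketch of the technique at issue (a concept weight computed from weighted features and then compared with a threshold), consider the following. The patterns, weights, threshold, and function names are all hypothetical and are not drawn from the claims or from Russell-Falla; summing the weights of the matched features is one assumed way of combining them.

```python
import re

# Hypothetical feature list: each regular expression carries an assumed weight.
FEATURE_WEIGHTS = {
    r"\bfree offer\b": 2.0,
    r"\bbuy now\b": 1.5,
    r"\bunsubscribe\b": 0.5,
}


def concept_weight(text, feature_weights):
    """Sum the weight of every feature that occurs at least once in the text."""
    total = 0.0
    for pattern, weight in feature_weights.items():
        if re.search(pattern, text, flags=re.IGNORECASE):
            total += weight
    return total


def recognize_concept(text, feature_weights, threshold):
    """Return True when the concept weight meets or exceeds the threshold."""
    return concept_weight(text, feature_weights) >= threshold


score = concept_weight("Buy now for a FREE OFFER", FEATURE_WEIGHTS)
recognized = recognize_concept("Buy now for a FREE OFFER", FEATURE_WEIGHTS, 3.0)
```

Here the concept weight serves as the confidence level, and the threshold decides whether the concept is treated as recognized.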

-In regard to dependent claims 4, 11, 13, and 19, Russell-Falla teaches by referencing 
said concept model (content category) (column 2, lines 35-39), generating an auto- 
attribute/category (column 8, lines 39-51), said auto-attribute/category being a descriptive label 
(i.e. "pornographic", "commercial solicitations", "racist", "good", "bad", etc)(column 3, lines 
39-43 & 60-67; column 8, lines 43-45) for said format/category document. 

-In regard to dependent claims 5, 12, 18, and 20, Russell-Falla teaches by referencing 
said concept model (content category)(column 2, lines 35-39), assigning said document format to 
a subject category/modeling directory (i.e. "pornographic", "commercial solicitations", "racist", 
"good", "bad", etc)(column 3, lines 39-43 & 60-67; column 8, lines 43-45) in a categorization 
taxonomy (column 4, lines 34-45) including a plurality of categories (i.e. "pornographic", 
"commercial solicitations", "racist", "good", "bad", etc)(column 3, lines 39-43 & 60-67; column 
8, lines 43-45). 
-In regard to dependent claim 10, Russell-Falla teaches incorporating said recognition
confidence level (category threshold) (column 2, lines 64-67; column 3, lines 1-16) in said 
conceptual model (content category) (column 2, lines 35-39) based on the training data (column 
6, lines 52-67; column 7, lines 1-67). 

-In regard to dependent claim 15, as shown above, Russell-Falla teaches wherein the 
common document format was hypertext (HTML) web pages (column 1, lines 33-37)(Fig. 1: 12) 
or other like information content (column 3, lines 17-22; column 6, lines 25-28; column 8, lines 
20-61: "file directories", "email messages", "database records", "other web pages", etc). 
Russell-Falla does not teach wherein the initial document format has to be converted to one of
the common document formats to be processed. Weiser et al teach converting a document
format (email message) from an email format to a common generic format (column 12, lines 53-
55). It would have been obvious to one of ordinary skill in the art at the time of the invention for
Russell-Falla to have converted its initial format document to one of the common document
formats listed above, because Weiser et al teach that by doing so the common format can be
understood by the document system (column 12, lines 44-56)(i.e. converting a document to a
format able to be processed by a specific system provides the obvious advantage of being
able to process the document in that system).

-In regard to dependent claim 16, Russell-Falla teaches separating the text content from 
said initial format document for categorizing documents based on statistical techniques (column 
2, lines 52-59). As shown above in dependent claim 15, Russell-Falla does not teach converting
the initial document format into a common document format. Weiser et al teach converting a
document format (email message) from an email format to a common generic format (column
12, lines 53-55). It would have been obvious to one of ordinary skill in the art at the time of the
invention for Russell-Falla to have converted its initial format document to one of the common
document formats listed above, because Weiser et al teach that by doing so the common format can be
understood by the document system (column 12, lines 44-56)(i.e. converting a document to a
format able to be processed by a specific system provides the obvious advantage of being
able to process the document in that system).

wherein it would also have been obvious to incorporate the text from the initial
document into said common document, because Russell-Falla teaches the textual content was
what was needed to categorize the incoming documents (column 4, lines 57-66).

-In regard to dependent claim 17, Russell-Falla teaches identifying a plurality of features 
(column 4, lines 59-61: "identify the regular expressions") in said document format, wherein said 
plurality of features represent evidence ("useful in discriminating a specific category of 
information")(column 4, lines 61-66) of said concept in said format. Russell-Falla further teaches 
calculating a concept weight for said concept ("calculating a rating of the page")(column 3, lines 
54-57) using a plurality of feature weights ("requires a weighting be provided for each word or
phrase in the list")(column 3, lines 46-57) associated with said plurality of features ("regular
expressions")(column 2, lines 55-59; column 8, lines 9-19), wherein said concept weight
represents a recognition confidence level for said concept (column 3, lines 54-57); and
comparing said concept weight with a predetermined threshold (column 2, lines
64-67; column 3, lines 1-16).

6. Claim 6 is rejected under 35 U.S.C. 103(a) as being unpatentable over Russell-Falla et al
(US: 6,675,162 01/06/04) in view of Weiser et al (US-5,982,507 11/09/99) in further view of
W3C, "Extensible Markup Language (XML) 1.0", 02/10/98, pp. 1-2,
http://www.w3.org/TR/1998/REC-xml-19980210.

-In regard to dependent claim 6, Russell-Falla teaches wherein a common document format
was hypertext (HTML) web pages (column 1, lines 33-37)(Fig. 1: 12) or other like information
content (column 3, lines 17-22; column 6, lines 25-28; column 8, lines 20-61: "file directories",
"email messages", "database records", "other web pages", etc). Russell-Falla does not
specifically teach wherein a common format was an XML document. W3C teaches wherein
using XML was notoriously well known in the art for web applications (p. 2: Section 1.1). It
would have been obvious to one of ordinary skill in the art at the time of the invention, for one of
the common formats of Russell-Falla to have been XML, because W3C teaches that the XML
format provides the benefits of being easy to create, being easy to write programs which process
XML documents, and being human-legible and reasonably clear (p. 2: Section 1.1). It was also
notoriously well known in the art at the time of the invention that XML was an International 
document standard and well known for its separation of data content which was the main 
embodiment of the Russell-Falla reference (column 4, lines 59-66; column 8, lines 20-38). 
Response to Arguments

7. Applicant's arguments, see Page 2, filed 09/20/04, with respect to the rejection(s) of
claim(s) 1-20 under 35 U.S.C. 103(a) as being unpatentable over Smadja (US: 6,621,930 
09/16/03) have been fully considered and are persuasive. Therefore, the rejection has been 
withdrawn. However, upon further consideration, a new ground(s) of rejection is made in view 
of newly applied prior art references as discussed above in the rejection of the claims. 

The Examiner notes that the present application does indeed receive benefit of the 
accompanying provisional application as noted above and that the rejection under Smadja (US: 
6,621,930 09/16/03) in the previous rejection was improper because Smadja did not qualify as 
valid prior art. 

Conclusion 

8. The prior art made of record and not relied upon is considered pertinent to applicant's 
disclosure. 

US-5,687,364 11-1997 Saund et al. 

US-6,119,114 09-2000 Smadja, Frank

US-5,897,645 04/27/99 Watters 

9. Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Adam L Basehoar whose telephone number is (571)-272-4121. 
The examiner can normally be reached on M-F: 7:00am - 4:00pm. 
If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, Steve Hong, can be reached on (703) 308-5465. The fax phone number for the
organization where this application or proceeding is assigned is 703-872-9306. 

Information regarding the status of an application may be obtained from the Patent 
Application Information Retrieval (PAIR) system. Status information for published applications 
may be obtained from either Private PAIR or Public PAIR. Status information for unpublished 
applications is available through Private PAIR only. For more information about the PAIR 
system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR 
system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). 



ALB 



STEPHEN HONG 
SUPERVISORY PATENT EXAMINER 




